You are an AI assistant that aids users in performing data analysis using Python and Pandas to find information.

Here is the dataset: you have the data fastfood, which contains the following columns: ['restaurant', 'item', 'calories', 'cal_fat', 'total_fat', 'sat_fat', 'trans_fat', 'cholesterol', 'sodium', 'total_carb', 'fiber', 'sugar', 'protein', 'vit_a', 'vit_c', 'calcium', 'salad'].
--- The description for each column of this data is:
restaurant: Name of fast food restaurant (McDonalds, Chick Fil-A, Sonic...etc)
item: Menu item at the restaurant (Artisan Grilled Chicken Sandwich, Single Bacon Smokehouse Burger, and Big Mac...etc) 
calories: Energy content of the food item (measured in calories) 
cal_fat: Calories in the item that come from fat (measured in calories)
total_fat: Total amount of fat in the item (measured in grams)
sat_fat: The amount of saturated fat in the item (measured in grams) 
trans_fat: Trans fat content of the item (measured in grams) 
cholesterol: Cholesterol content of the item (measured in milligrams)
sodium: Sodium content of the item (measured in milligrams)
total_carb: Total carbohydrates in the item (measured in grams)
fiber: Dietary fiber content of the item (measured in grams)
sugar: Total sugar content in the item (measured in grams)
protein: Total protein content in the item (measured in grams)
vit_a: Vitamin A content in the item (measured as a percentage of the recommended daily intake)
vit_c: Vitamin C content in the item (measured as a percentage of the recommended daily intake)
calcium: Calcium content in the item (measured in a percentage of the recommended daily intake) 
salad: Type of food item (main course, side dish, or dessert)
---
Considering contents from the dataset and result types from user, you only need to generate codes and notations. Please note DO NOT CHANGE FILE AND VARIABLE NAMES THAT I HAVE SET!
Conversation begins:
[USER (data scientist)]:Please filter the dataset to include only main course items such as sandwiches, wraps, and salads, and exclude side dishes and desserts. Then, provide the filtered dataset containing only main course items.

[YOU (AI assistant)]:To filter the dataset to include only main course items, I need to know the criteria to identify main course items such as sandwiches, wraps, and salads. Please provide a condition or a list of keywords that can be used to filter the main course items.

[USER (data scientist)]:Use the following keywords to filter the main course items: 'sandwich', 'wrap', 'salad', 'burger', 'burrito', 'taco'. Make sure the filtering is case-insensitive. Please load the 'fastfood.csv' dataset into a DataFrame, then filter it to include only rows where the 'item' column contains one of several keywords related to fast food items (making the search case-insensitive), and finally, save the filtered DataFrame as a pickle file.

[YOU (AI assistant)]: Sure, I can help you with that:
'''
import pandas as pd   
import pickle

# Load the dataset    
fastfood = pd.read_csv("fastfood.csv")  

# Keywords that identify main course items
keywords = ['sandwich', 'wrap', 'salad', 'burger', 'burrito', 'taco']

# Build a case-insensitive match on the 'item' column; missing items count as non-matches
mask = fastfood['item'].str.contains('|'.join(keywords), case=False, na=False)

# Filter the dataset using the keywords
filtered_fastfood = fastfood[mask]

print(filtered_fastfood)

'''
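
The request above also asks for the filtered DataFrame to be saved as a pickle file, which the snippet does not do. Here is a minimal sketch on a toy DataFrame. The rows and the output file name `filtered_fastfood_demo.pkl` are assumptions for illustration, so substitute whatever path your project expects:

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for fastfood.csv (hypothetical rows, for illustration only)
toy = pd.DataFrame({
    "restaurant": ["A", "B", "C", "D"],
    "item": ["Grilled Chicken SANDWICH", "Large Fries", "Beef Taco", None],
})

keywords = ['sandwich', 'wrap', 'salad', 'burger', 'burrito', 'taco']

# case=False makes the match case-insensitive; na=False treats missing items as non-matches
mask = toy["item"].str.contains("|".join(keywords), case=False, na=False)
toy_filtered = toy[mask]

# Save and reload; the file name is an assumption, not part of the original task
out_path = os.path.join(tempfile.gettempdir(), "filtered_fastfood_demo.pkl")
toy_filtered.to_pickle(out_path)
reloaded = pd.read_pickle(out_path)

print(reloaded)
```

Only the sandwich and taco rows should survive the filter, and the reloaded DataFrame should match the one that was saved.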

[USER (data scientist)]:Great, now that we have the filtered dataset, let's create a scoring system that takes into account the nutritional factors (calories, total fat, saturated fat, trans fat, cholesterol, sodium, sugar, and protein), with more weight given to calories, total fat, saturated fat, and sodium. Please provide a scoring system formula. Please generate a DataFrame and a pickle file to evaluate the nutritional quality of fast food items by calculating the percentage of Recommended Daily Intake (RDI) for various nutrients and a Comprehensive Modified NRF Index.

[YOU (AI assistant)]: One of the widely used nutrient profiling systems in the US is the Nutrient Rich Foods (NRF) Index. The NRF Index is based on the concept of nutrient density, which is the ratio of beneficial nutrients to the energy content of a food item. The NRF Index takes into account both beneficial nutrients (protein, fiber, vitamins, and minerals) and nutrients to limit (saturated fat, added sugar, and sodium). 
Here's the formula for the NRF Index: NRF Index = (Beneficial Nutrients Score - Nutrients to Limit Score) 
 
Our dataset does not include all the beneficial nutrients considered in the NRF Index, but we can create a modified version using the available data: Comprehensive Modified NRF Index = (Protein + Fiber + Vit_A + Vit_C + Calcium) - (Sat_Fat + Trans_Fat + Sugar + Sodium + Total_Carb)
To calculate the Comprehensive Modified NRF Index using the column names, you can follow these steps: 
Normalize the nutrient values by dividing them by the recommended daily intake (RDI) for each nutrient. 
Multiply the normalized values by 100 to get the percentage of RDI. 
Sum the percentage of RDI for beneficial nutrients and nutrients to limit separately. 
Subtract the sum of nutrients to limit from the sum of beneficial nutrients.
Here's the code to calculate the Comprehensive Modified NRF Index using the column names:
'''
 
import pandas as pd   
import pickle
   
# Assuming the dataset is loaded as a DataFrame named 'fastfood'

# Recommended Daily Intake (RDI) values (grams, except sodium in milligrams)
rdi_protein = 50
rdi_fiber = 25
rdi_sat_fat = 20
rdi_trans_fat = 2
rdi_sugar = 50
rdi_sodium = 2400
rdi_total_carb = 300

# Calculate the percentage of RDI for each nutrient
fastfood['protein_score'] = (fastfood['protein'] / rdi_protein) * 100
fastfood['fiber_score'] = (fastfood['fiber'] / rdi_fiber) * 100
# vit_a, vit_c, and calcium are already recorded as a percentage of RDI
# (see the column descriptions), so they are used as-is
fastfood['vit_a_score'] = fastfood['vit_a']
fastfood['vit_c_score'] = fastfood['vit_c']
fastfood['calcium_score'] = fastfood['calcium']
fastfood['sat_fat_score'] = (fastfood['sat_fat'] / rdi_sat_fat) * 100
fastfood['trans_fat_score'] = (fastfood['trans_fat'] / rdi_trans_fat) * 100
fastfood['sugar_score'] = (fastfood['sugar'] / rdi_sugar) * 100
fastfood['sodium_score'] = (fastfood['sodium'] / rdi_sodium) * 100
fastfood['total_carb_score'] = (fastfood['total_carb'] / rdi_total_carb) * 100

# Calculate the Comprehensive Modified NRF Index
beneficial = (
    fastfood['protein_score'] + fastfood['fiber_score'] + fastfood['vit_a_score']
    + fastfood['vit_c_score'] + fastfood['calcium_score']
)
nutrients_to_limit = (
    fastfood['sat_fat_score'] + fastfood['trans_fat_score'] + fastfood['sugar_score']
    + fastfood['sodium_score'] + fastfood['total_carb_score']
)
fastfood['comprehensive_modified_nrf_index'] = beneficial - nutrients_to_limit

print(fastfood)

'''
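
To check the arithmetic, the four steps can be worked through for a single item. This is only a sketch: the menu item and its values are invented, the RDI values are copied from the snippet above, and `vit_a`, `vit_c`, and `calcium` are assumed to be percentages of RDI already, as the column descriptions state.

```python
import pandas as pd

# RDI values copied from the snippet above (grams, except sodium in milligrams)
rdi = pd.Series({
    "protein": 50, "fiber": 25, "sat_fat": 20, "trans_fat": 2,
    "sugar": 50, "sodium": 2400, "total_carb": 300,
})

beneficial_gram = ["protein", "fiber"]           # need normalizing by RDI
beneficial_pct = ["vit_a", "vit_c", "calcium"]   # assumed to be % of RDI already
to_limit = ["sat_fat", "trans_fat", "sugar", "sodium", "total_carb"]

# One hypothetical menu item, values invented for illustration
toy = pd.DataFrame([{
    "item": "Demo Sandwich",
    "protein": 25, "fiber": 5, "vit_a": 10, "vit_c": 15, "calcium": 20,
    "sat_fat": 10, "trans_fat": 0.5, "sugar": 10, "sodium": 1200, "total_carb": 45,
}])

def pct_rdi(df, cols):
    """Return each column in `cols` as a percentage of its RDI."""
    return df[cols].div(rdi[cols]) * 100

# Sum the beneficial nutrients and the nutrients to limit separately, then subtract
beneficial = pct_rdi(toy, beneficial_gram).sum(axis=1) + toy[beneficial_pct].sum(axis=1)
limit_sum = pct_rdi(toy, to_limit).sum(axis=1)
toy["comprehensive_modified_nrf_index"] = beneficial - limit_sum

print(toy[["item", "comprehensive_modified_nrf_index"]])
```

For this toy item the beneficial nutrients sum to 115 and the nutrients to limit sum to 160, so the index is -45. A negative value means the nutrients to limit outweigh the beneficial ones.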

[USER (data scientist)]: Generate the code to calculate the health score for each item in the filtered dataset using this scoring system. The dataset should contain the health score for each item calculated using the provided formula. Please normalize nutrient values and calculate a health score for each item in the 'filtered_fastfood' DataFrame, resulting in an updated DataFrame and a saved pickle file containing the modified data.

My template of code snippet is:

---BEGIN CODE TEMPLATE--- 

import pandas as pd
import pickle

# Read the fastfood data 
fastfood = pd.read_csv("fastfood.csv")  

# YOUR SOLUTION BEGIN:
<code1>
[GENERATE YOUR CODE]
</code1>
# YOUR SOLUTION END: 

print(filtered_fastfood)
pickle.dump(filtered_fastfood,open("./pred_result/filtered_fastfood.pkl","wb"))

---END CODE TEMPLATE---

Please note that you have to generate the WHOLE Python code, not just code segments, based on the code snippet, using Pandas version 2.0.3 and Matplotlib version 3.7.4. You must keep all comments in the code snippet unchanged.

[YOU (AI assistant)]: Sure, I can help you with that.
